Highly accurate children's speech recognition for interactive reading tutors using subword units
Abstract
Speech technology offers great promise in the field of automated literacy and reading tutors for children. In such applications, speech recognition can be used to track the child's reading position, detect oral reading miscues, assess comprehension of the text being read by estimating whether the prosodic structure of the speech is appropriate to the discourse structure of the story, or engage the child in interactive dialogs to assess and train comprehension. Despite this promise, speech recognition systems exhibit higher error rates for children due to variability in vocal tract length, formant frequency, pronunciation, and grammar. When children read aloud, these problems are compounded by speech production behaviors that arise from difficulty recognizing printed words, which causes pauses, repeated syllables, and other phenomena. To overcome these challenges, we present advances in speech recognition that improve accuracy and modeling capability in the context of an interactive literacy tutor for children. Specifically, this paper focuses on a novel set of speech recognition techniques that can be applied to improve oral reading recognition. First, we demonstrate that speech recognition error rates for interactive read-aloud can be reduced by more than 50% through a combination of advances in both statistical language modeling and acoustic modeling. Next, we propose extending our baseline system by introducing a novel token-passing search architecture targeting subword-unit-based speech recognition. The proposed subword-unit-based framework is shown to provide accuracy equivalent to a whole-word-based speech recognizer while enabling detection of oral reading events and finer-grained speech analysis during recognition.
The efficacy of the approach is demonstrated using data collected from children in grades 3–5: 34.6% of partial words with reasonable evidence in the speech signal are detected at a low false alarm rate of 0.5%. © 2007 Elsevier B.V. All rights reserved.
Similar resources
Data driven subword unit modeling for speech recognition and its application to interactive reading tutors
This paper proposes a novel token-passing search architecture for supporting subword unit based speech recognition and a corresponding algorithm based on the well-known LZW text compression method to determine a vocabulary of subword units in an unsupervised manner. We compare our subword unit selection algorithm to an existing approach based on Minimum Description Length (MDL) modeling and als...
Are Initial/Final Units Acoustically Accurate?
We show a comparative study of subword unit segmentation of Mandarin speech data. Most HMM recognition systems use initial/finals as subword units for Mandarin speech. We find that such a division of monosyllable data into initial/final units is not always supported by acoustic evidence. We implement a delta MFCC based segmentation method and compare its output with that of Viterbi segmentation...
Automatic assessment of children's reading with the FLaVoR decoding using a phone confusion model
Reading skills of children can be improved with the help of automatic reading tutors (ART), i.e. interactive software with an appealing interface which supports and challenges the child in the reading task, provides instantaneous feedback and automatically assesses its reading skills. For this purpose, ARTs benefit from automatic speech recognition technology for tracking the child’s responses ...
Automatic generation of subword units for speech recognition systems
Large vocabulary continuous speech recognition (LVCSR) systems traditionally represent words in terms of smaller subword units. Both during training and during recognition, they require a mapping table, called the dictionary, which maps words into sequences of these subword units. The performance of the LVCSR system depends critically on the definition of the subword units and the accuracy of t...
Italian children's speech recognition for advanced interactive literacy tutors
This work was conducted with the specific goals of developing improved recognition of children’s speech in Italian and the integration of the children’s speech recognition models into the Italian version of the Colorado Literacy Tutor platform. Specifically, children’s speech recognition research for Italian was conducted using the ITC-irst Children’s Speech Corpus. Using the University of Colo...
Journal title:
- Speech Communication
Volume: 49
Publication date: 2007